Optimizing Graph Algorithms on Pregel-like Systems
نویسندگان
چکیده
We study the problem of implementing graph algorithms efficiently on Pregel-like systems, which can be surprisingly challenging. Standard graph algorithms in this setting can incur unnecessary inefficiencies such as slow convergence or high communication or computation cost, typically due to structural properties of the input graphs such as large diameters or skew in component sizes. We describe several optimization techniques to address these inefficiencies. Our most general technique is based on the idea of performing some serial computation on a tiny fraction of the input graph, complementing Pregel’s vertex-centric parallelism. We base our study on thorough implementations of several fundamental graph algorithms, some of which have, to the best of our knowledge, not been implemented on Pregel-like systems before. The algorithms and optimizations we describe are fully implemented in our open-source Pregel implementation. We present detailed experiments showing that our optimization techniques improve runtime significantly on a variety of very large graph datasets.
منابع مشابه
Mizan: Optimizing Graph Mining in Large Parallel Systems
Extracting information from graphs, from finding shortest paths to complex graph mining, is essential for many applications. Due to the shear size of modern graphs (e.g., social networks), processing must be done on large parallel computing infrastructures (e.g., the cloud). Earlier approaches relied on the MapReduce framework, which was proved inadequate for graph algorithms. More recently, th...
متن کاملAn Experimental Comparison of Pregel-like Graph Processing Systems
The introduction of Google’s Pregel generated much interest in the field of large-scale graph data processing, inspiring the development of Pregel-like systems such as Apache Giraph, GPS, Mizan, and GraphLab, all of which have appeared in the past two years. To gain an understanding of how Pregel-like systems perform, we conduct a study to experimentally compare Giraph, GPS, Mizan, and GraphLab...
متن کاملFrom "Think Like a Vertex" to "Think Like a Graph"
To meet the challenge of processing rapidly growing graph and network data created by modern applications, a number of distributed graph processing systems have emerged, such as Pregel and GraphLab. All these systems divide input graphs into partitions, and employ a “think like a vertex” programming model to support iterative graph computation. This vertex-centric model is easy to program and h...
متن کاملTech Report: Compiling GreenMarl into GPS
The massive size of the data in large graph processing requires distributed processing. However, conventional frameworks for distributed graph processing, such as Pregel, use programming models that are well-suited for scalability but inconvenient for programming graph algorithms. In this paper, we use Green-Marl, a Domain-Specific Language for graph analysis, to describe graph algorithms intui...
متن کاملPregel Algorithms for Graph Connectivity Problems with Performance Guarantees
Graphs in real life applications are often huge, such as the Web graph and various social networks. These massive graphs are often stored and processed in distributed sites. In this paper, we study graph algorithms that adopt Google’s Pregel, an iterative vertexcentric framework for graph processing in the Cloud. We first identify a set of desirable properties of an efficient Pregel algorithm, ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- PVLDB
دوره 7 شماره
صفحات -
تاریخ انتشار 2014